Fine-Grained Head Pose Estimation Without Keypoints
Estimating the head pose of a person is a crucial problem with many
applications, such as aiding gaze estimation, modeling attention, fitting 3D
models to video, and performing face alignment. Traditionally, head pose is
computed by estimating keypoints from the target face and solving the
2D-to-3D correspondence problem with a mean human head model. We argue that
this is a fragile method because it relies entirely on landmark detection
performance, the extraneous head model and an ad-hoc fitting step. We present
an elegant and robust way to determine pose by training a multi-loss
convolutional neural network on 300W-LP, a large synthetically expanded
dataset, to predict intrinsic Euler angles (yaw, pitch and roll) directly from
image intensities through joint binned pose classification and regression. We
present empirical tests on common in-the-wild pose benchmark datasets which
show state-of-the-art results. Additionally, we test our method on a dataset
usually used for pose estimation with depth and start to close the gap with
state-of-the-art depth pose methods. We open-source our training and testing
code as well as release our pre-trained models.
Comment: Accepted to the 2018 IEEE Conference on Computer Vision and Pattern
Recognition Workshops (CVPRW).
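The joint binned-classification-and-regression idea described above can be sketched in miniature. The 66 three-degree bins covering [-99, 99) and the loss weight `alpha` below are illustrative assumptions, not necessarily the paper's exact configuration:

```python
import numpy as np

# Sketch of a multi-loss head pose readout: the network emits logits over
# coarse angle bins; cross-entropy supervises the binned label, while the
# continuous angle, recovered as an expectation over the bin distribution,
# is supervised with a regression loss.
BIN_CENTERS = np.arange(66) * 3.0 - 99.0  # assumed 3-degree bins, in degrees

def softmax(logits):
    e = np.exp(logits - logits.max())
    return e / e.sum()

def expected_angle(logits):
    """Continuous angle: expectation of the bin centers under softmax."""
    return float(softmax(logits) @ BIN_CENTERS)

def multi_loss(logits, true_angle, alpha=0.5):
    """Binned cross-entropy plus alpha-weighted MSE on the expected angle."""
    true_bin = int(np.clip((true_angle + 99.0) // 3.0, 0, 65))
    ce = -np.log(softmax(logits)[true_bin] + 1e-12)
    mse = (expected_angle(logits) - true_angle) ** 2
    return ce + alpha * mse
```

The expectation over bins is what lets a classification head produce a fine-grained continuous prediction.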
Learning to Localize and Align Fine-Grained Actions to Sparse Instructions
Automatic generation of textual video descriptions that are time-aligned with
video content is a long-standing goal in computer vision. The task is
challenging due to the difficulty of bridging the semantic gap between the
visual and natural language domains. This paper addresses the task of
automatically generating an alignment between a set of instructions and a
first-person video demonstrating an activity. The sparse descriptions and ambiguity
of written instructions create significant alignment challenges. The key to our
approach is the use of egocentric cues to generate a concise set of action
proposals, which are then matched to recipe steps using object recognition and
computational linguistic techniques. We obtain promising results on both the
Extended GTEA Gaze+ dataset and the Bristol Egocentric Object Interactions
Dataset.
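A toy version of the matching step can illustrate the idea: action proposals carrying recognized object labels are matched to recipe steps in order. The Jaccard scoring and the order-preserving greedy pass here are illustrative assumptions, not the paper's actual method:

```python
def jaccard(a, b):
    """Overlap between two label sets (0 when both are empty)."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0

def align_proposals_to_steps(proposals, steps):
    """Greedy, order-preserving alignment: each proposal is matched to the
    best-scoring step at or after the previously matched step.

    proposals: list of object-label lists, one per action proposal
    steps: list of word lists, one per recipe step
    """
    alignment, last = [], 0
    for objs in proposals:
        scores = [jaccard(objs, step) for step in steps[last:]]
        best = max(range(len(scores)), key=scores.__getitem__) + last
        alignment.append(best)
        last = best
    return alignment
```

The monotonicity constraint encodes the assumption that a demonstrator performs recipe steps roughly in order.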
Modelling seasonal environmental preferences of tropical tuna purse seine fisheries in the Mozambique Channel
The spatio-temporal environmental preferences and biomass aggregation of tropical tuna caught by the purse seine
fishery in the Mozambique Channel (MZC) have barely been investigated. In this study, tuna biomass
from Fish Aggregating Devices (FADs) and Free-Swimming Schools (FSC), collected from Spanish fishing logbooks
during 2003–2013, was modelled separately as a function of a set of oceanographic variables (sea surface
temperature, sea surface height, geostrophic currents, salinity, and chlorophyll-a) using Generalized Additive
Models (GAMs). Temporal variables (day of the year, month, and year) and spatial variables (latitude and longitude)
were included in the models to account for the spatio-temporal structure of tropical tuna biomass
aggregation. Oceanographic, temporal, and spatial effects on aggregated catches differed between fishing
modes, although some common patterns emerged across the study area and period. Fishable patches of
tuna biomass accumulation were explained by sea surface temperature, productivity, sea surface height, and
geostrophic currents, in addition to interactions among the spatio-temporal variables. Although the models
predicted slightly different preferred fishing spots for each fishing mode, the two partially overlapped.
Goodness of fit for the selected variables showed that the models predicted tuna catch aggregation patterns in the MZC
reasonably well. These results highlight a connection between the biophysical state of the oceans and purse seine
tuna catches in the MZC, and may ultimately contribute to scientific advice for the appropriate management
and conservation of the resources exploited by purse seine fleets in the MZC.
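The additive-model idea behind a GAM can be illustrated with a simple least-squares stand-in: the response is modelled as a sum of per-covariate smooth functions. The polynomial basis below is a crude substitute for GAM smoothers, and the variable names are hypothetical standardized covariates, not the study's data:

```python
import numpy as np

def poly_basis(x, degree=4):
    """Polynomial basis for one covariate, a crude stand-in for GAM smoothers."""
    return np.column_stack([x ** k for k in range(1, degree + 1)])

def fit_additive(X, y, degree=4):
    """Least-squares additive model: y ~ intercept + sum_j f_j(x_j)."""
    design = np.column_stack(
        [np.ones(len(y))] + [poly_basis(X[:, j], degree) for j in range(X.shape[1])]
    )
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)

    def predict(Xnew):
        d = np.column_stack(
            [np.ones(len(Xnew))]
            + [poly_basis(Xnew[:, j], degree) for j in range(Xnew.shape[1])]
        )
        return d @ coef

    return predict
```

A real GAM would use penalized spline smoothers with automatic smoothness selection; the additive structure, each covariate contributing its own fitted curve, is the shared idea.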
Platypus: Quick, Cheap, and Powerful Refinement of LLMs
We present Platypus, a family of fine-tuned and merged Large
Language Models (LLMs) that achieves the strongest performance and currently
stands at first place in HuggingFace's Open LLM Leaderboard as of the release
date of this work. In this work we describe (1) our curated dataset
Open-Platypus, which is a subset of other open datasets and which we release
to the public, (2) our process of fine-tuning and merging
LoRA modules in order to conserve the strong prior of pretrained LLMs, while
bringing specific domain knowledge to the surface, and (3) our efforts in checking
for test data leaks and contamination in the training data, which can inform
future research. Specifically, the Platypus family achieves strong performance
in quantitative LLM metrics across model sizes, topping the global Open LLM
leaderboard while using just a fraction of the fine-tuning data and overall
compute that are required for other state-of-the-art fine-tuned LLMs. In
particular, a 13B Platypus model can be trained on a single A100 GPU
using 25k questions in 5 hours. This is a testament to the quality of our
Open-Platypus dataset, and opens opportunities for more improvements in the
field. Project page: https://platypus-llm.github.io
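The LoRA-merging step mentioned above can be illustrated in miniature: an adapter's low-rank update is folded back into the base weight so inference needs no extra parameters. The shapes and the alpha/r scaling follow the standard LoRA convention; this is a sketch of the general technique, not the Platypus merging pipeline:

```python
import numpy as np

def merge_lora(W, A, B, alpha=16.0, r=4):
    """Fold a LoRA adapter into the base weight: W' = W + (alpha / r) * B @ A.

    W: (d_out, d_in) frozen base weight
    A: (r, d_in) and B: (d_out, r) low-rank adapter factors
    """
    return W + (alpha / r) * (B @ A)
```

Because the update is rank-r, the adapter trains only r * (d_out + d_in) parameters per layer, yet after merging the forward pass is a single dense matmul again.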